Abstract

A simple graphical model for correlated defaults is proposed, with explicit
formulas for the loss distribution. Algebraic geometry techniques are employed
to show that this model is well posed for default dependence: it represents any
given marginal distribution for single firms and pairwise correlation matrix.
These techniques also provide a calibration algorithm based on maximum
likelihood estimation. Finally, the model is compared with the standard normal
copula model in terms of tails of the loss distribution and implied correlation smile.



1 Introduction 

Credit risk concerns the valuation and hedging of defaultable financial securities
(see e.g. Bielecki and Rutkowski [I], Duffie and Singleton [15], Bielecki et al. [5],
Lando [31], and the references therein). Since investors almost always engage in a
range of different instruments related to multiple firms, successful modeling of the
interaction of default risk for multiple firms is crucial for both risk management and
credit derivative pricing. The significance of default correlation is highlighted by the
current financial crisis.



*Department of Industrial Engineering and Operations Research, University of California,
Berkeley, CA 94720-1777. Email: onurf@ieor.berkeley.edu

†Department of Industrial Engineering and Operations Research, University of California,
Berkeley, CA 94720-1777. Email: xinguo@ieor.berkeley.edu

‡Department of Mathematics, Stanford University, Palo Alto, CA 94305.
Email: jason@math.stanford.edu

§Department of Mathematics, University of California, Berkeley, CA 94720-3840.
Email: bernd@math.berkeley.edu






There are a number of approaches for modeling correlated default. Collin-Dufresne
et al. [9], Duffie and Singleton [Hj and Schonbucher and Schubert [JT] extend the
reduced form models by assuming correlated intensity processes. Intensity-based
models, however, tend to induce unrealistic levels of correlation. Hull et al. [25],
Hull and White [26], and Zhou [48] take the structural form approach and use
correlated asset processes, extending the classical framework of Black and Cox [3].
These models nevertheless imply spreads close to zero for short maturities, similar
to their single-firm counterparts.

Other approaches for default dependence include the so-called "contagion mod- 
els", where default of one firm affects the default process of the remaining firms. 
For example, Davis and Lo [12] use binary random variables for the default state of 
each firm, where these random variables are a function of a common set of indepen- 
dent identically distributed binary random variables. Jarrow and Yu [28] extend the 
reduced form setup by assuming that the intensity for the default process of each 
firm explicitly depends on the default of other firms, thus one default causes jumps 
in intensities of other firms' default processes. Giesecke and Weber [22] place the
firms on the nodes of a multi-dimensional lattice, and model their interaction by
employing the voter model from the theory of interacting particle systems. The
top-down approach, on the other hand, models the credit portfolio as a whole, focusing
on the loss process rather than the processes of individual firms. Some examples of
this approach include Giesecke and Goldberg [21], Errais et al. [17], Frey and
Backhaus [IS], Schonbucher and Ehler [40] and Sidenius et al. [42]. These models show
their strength in situations where modeling the individual firm processes is not of
primary importance: modeling index reference portfolios, or when the firms are very
small in comparison to the portfolio.

The binomial expansion method (BET) [8], Credit Suisse's CreditRisk+ [I], and
J. P. Morgan's CreditMetrics [23] are well-known approaches in the finance industry.
While BET represents the loss distribution as a binomial random variable with the
number of trials in between the two extremes, CreditMetrics and CreditRisk+ focus on
individual defaults. The distribution of the random variable representing the state of
the firm is parameterized by a set of factors which are shared among the firms, but
with varying weights. In contrast, copula models (see Schonbucher [39] for a general
survey, and Li [33], Vasicek [44] for the normal copula) separate the modeling of
the interdependence of random variables from the modeling of their marginal
distributions. Though popular due to its tractability, the normal copula suffers from two
well-recognized deficiencies: a) it fails to produce the fat tails observed in the credit
derivatives market for the distribution of the number of losses; b) the implied
correlations in a normal copula for the equity and senior tranches of a Collateralized Debt
Obligation (CDO) are higher than those for the mezzanine tranches, a phenomenon
known as the "correlation smile".

Our work In this paper, a simple graphical model for correlated defaults is
proposed and analyzed (Section 2). This model has an intuitive graphic structure and
the loss distribution for its special one-period version is simply a summation of
binomial random variables. This model is well posed in capturing default dependence in
the following sense: it can represent any given marginal distribution for single firms
and pairwise correlation matrix. Techniques from algebraic geometry are employed
to prove this well-posedness and to provide a calibration algorithm for the model.
Explicit formulas for the loss distribution and for CDO prices are derived. Finally,
unlike the standard normal copula approach, this model can produce fat tails for loss
distributions and correct the correlation smile (Section 3).

In addition to the proposal and analysis of a simple model for default correlation, 
one major contribution of this paper is the introduction of a new algebraic tech- 
nique to study inequalities implied in correlation structures. As correlation in any 
multi-variate probability distribution naturally leads to certain linear or non-linear 
inequalities, we are hopeful that this new tool will provide a powerful alternative 
to existing approaches such as copulas in the mathematical finance literature for 
analyzing default correlation. 

2 The Graphical Model for Defaults 

In this section a class of hierarchical models is formulated to model default risk for 
multiple names. The most generic form of the model is first presented, followed by 
a specialization to homogeneous parameterizations for ease of calibration and
comparison with existing models. To provide context, the simple terminology of "firms",
"sectors" and "default" is adopted, although our model is applicable in any generic
context with interaction between multiple entities. In the finance context it includes
any type of asset backed security (ABS) on multiple names. For example, one can
represent a Collateralized Mortgage Obligation (CMO) with our proposed graph
structure, simply by replacing "sectors", "firms", and "default" with "geographical
region", "mortgage holder", and "refinancing or default" respectively.

2.1 General Form 

Take an undirected graph $G = (V, E)$ with $M$ nodes, and denote the set of nodes by
$V := \{1, \ldots, M\}$. The edge set $E$ is a subset of the $\binom{M}{2}$ possible pairwise
connections between any pairs of nodes, i.e. $E \subseteq \{(u,v) : 1 \le u < v \le M\}$. Each node
of the graph corresponds to a firm and has an associated binary random variable $X_i$ $(i \in V)$,
with $X_i = 1$ representing the default of firm $i$ and $X_i = 0$ the survival. The joint
probability distribution of the random variable $X := (X_1, \ldots, X_M)$ is given by

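$$p_w(\eta) := P(X = w) = \frac{1}{Z} \exp\Big( \sum_{i \in V} \eta_i w_i + \sum_{(u,v) \in E} \eta_{uv} w_u w_v \Big). \tag{1}$$

Here $w = (w_1, \ldots, w_M)$ runs over $\{0,1\}^M$, the scalars $\eta_i \in \mathbb{R}$ and $\eta_{uv} \in \mathbb{R}$ are
parameters, and $Z$ is the normalization constant known as the partition function:

$$Z = \sum_{w \in \{0,1\}^M} \exp\Big( \sum_{i \in V} \eta_i w_i + \sum_{(u,v) \in E} \eta_{uv} w_u w_v \Big).$$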
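For small $M$, Equation (1) can be evaluated by direct enumeration of all $2^M$ states. The following Python sketch is only an illustration: the function name `joint_distribution`, the 0-based node indexing and the numerical parameter values are assumptions made here, not part of the model specification.

```python
import itertools
import math

def joint_distribution(M, eta_node, eta_edge):
    """Return {w: p_w} for Equation (1) by brute-force enumeration.

    eta_node: list of the M node parameters eta_i (0-based indexing).
    eta_edge: dict mapping an edge (u, v) with u < v to eta_uv.
    """
    weights = {}
    for w in itertools.product((0, 1), repeat=M):
        exponent = sum(eta_node[i] * w[i] for i in range(M))
        exponent += sum(e * w[u] * w[v] for (u, v), e in eta_edge.items())
        weights[w] = math.exp(exponent)
    Z = sum(weights.values())  # partition function
    return {w: x / Z for w, x in weights.items()}

# Made-up parameters on a triangle graph, for illustration only.
p = joint_distribution(3, [-2.0, -1.5, -1.0],
                       {(0, 1): 0.8, (0, 2): 0.5, (1, 2): -0.3})
# Marginal default probability of the first firm.
marginal_1 = sum(prob for w, prob in p.items() if w[0] == 1)
```

The enumeration costs $2^M$ evaluations, so this approach is only practical for a handful of nodes.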

It is worth mentioning that Kitsukawa et al. [30] and Molins and Vives [35]
have suggested using the long range Ising model (LRIM) in the credit risk context.
However, these models are special cases of our formulation, and make use of physical 
concepts with no clear financial interpretation. They restrict the structure to a 
single-period model with a completely connected graph and assume that all edge 
interactions are homogeneous. We investigate heterogeneous connections and a sector 
model, analyze the multi-period setting, and provide pricing formulas. 

Well-posedness of the model Probabilistic models of the form (1) are also known
as Markov random fields, as the Ising model in physics, or as graphical models in
computer science [29] and statistics [32]. In the finance context, assessment of default
correlation is usually assumed to identify the following two sets of sufficient statistics:

• The marginal default probability $P(X_i = 1)$ is known for each firm $i \in V$.

• The pairwise linear correlation

$$\frac{P(X_u = X_v = 1) - P(X_u = 1) \cdot P(X_v = 1)}{\sqrt{P(X_u = 1) \cdot (1 - P(X_u = 1)) \cdot P(X_v = 1) \cdot (1 - P(X_v = 1))}} \tag{2}$$

is assumed to be known for all pairs of firms $u$ and $v$ that share an edge in $E$.

Therefore, we shall demonstrate that this model is well posed for modelling correlated 
default: for every set of marginal default probabilities and correlations, there exists 
a unique set of parameters matching that information. 

Clearly, data on the marginal default probabilities and the pairwise linear
correlations is equivalent to the following set of $M + |E|$ sufficient statistics:

• The single node marginals $P_i := P(X_i = 1)$ for all $i \in V$.

• The double node marginals $P_{uv} := P(X_u = X_v = 1)$ for all $(u,v) \in E$.

Denoting this set of marginals by $P_\bullet \in [0,1]^{M+|E|}$, we shall show:

Theorem 1. Assume any given set of statistics $P_\bullet$ from some probability distribution
on $M$ binary random variables. Then, there exists a unique set of parameters $\eta_i, \eta_{uv}$
such that the single and double node marginals implied by Equation (1) match $P_i, P_{uv}$.

The proof of Theorem 1 relies on techniques from algebraic geometry. The key
ingredients of the proof are illustrated through a simple example to gain some insight.
These ingredients are essential for model implementation as well (see the calibration
discussion later in this section).

The first ingredient is an integer matrix $A_G$ associated with the graph model.

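Example 2. Let $G$ be the triangle with $V = \{1,2,3\}$ and $E = \{\{1,2\}, \{1,3\}, \{2,3\}\}$.
The marginals $P_i, P_{uv}$ are characterized by the following 16 linear inequalities:

$$\begin{aligned}
& P_i \ge P_{ij} \ge 0 \quad \text{for all } i, j, \\
& P_1 + P_2 \le P_{12} + 1, \quad P_1 + P_3 \le P_{13} + 1, \quad P_2 + P_3 \le P_{23} + 1, \\
& P_1 + P_{23} \ge P_{12} + P_{13}, \quad P_2 + P_{13} \ge P_{12} + P_{23}, \quad P_3 + P_{12} \ge P_{13} + P_{23}, \\
& \text{and} \quad P_1 + P_2 + P_3 \le P_{12} + P_{13} + P_{23} + 1.
\end{aligned} \tag{3}$$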

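These inequalities can be derived in the following way. First consider the expansion of
the marginal default probabilities in terms of the elementary probabilities
$p_{000}, p_{001}, \ldots, p_{111}$:

$$P_i = \sum_{w \in \{0,1\}^3} a_{iw}\, p_w,$$

where $a_{iw} \in \{0,1\}$. Then construct a $\{0,1\}$-valued matrix $A_G$ using the $a_{iw}$ values,
where each row corresponds to a marginal probability, whereas the columns
correspond to the elementary probabilities. For the example, with rows ordered
$P_1, P_2, P_3, P_{12}, P_{13}, P_{23}$ and columns ordered $p_{000}, p_{001}, p_{010}, p_{011}, p_{100}, p_{101}, p_{110}, p_{111}$,
this matrix becomes

$$A_G = \begin{pmatrix}
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1
\end{pmatrix}. \tag{4}$$

Then we have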

$$\{P_\bullet : P_\bullet \text{ satisfies inequalities (3)}\} = \Big\{P_\bullet : P_\bullet = A_G \cdot p,\ p \in \mathbb{R}^{2^M}_{\ge 0} \text{ and } \sum_{w \in \{0,1\}^M} p_w = 1 \Big\}.$$

In other words, the solution set of the 16 linear inequalities in (3) is the six-dimensional
polytope which can be obtained by taking the convex hull of the columns of $A_G$.
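The statement of Example 2 can be checked numerically: every probability vector on $\{0,1\}^3$ must map under $A_G$ to marginals satisfying the 16 inequalities in (3). The sketch below is an illustration under assumed conventions (0-based nodes, lexicographic column order, helper names `marginal_matrix` and `satisfies_triangle_inequalities` invented here).

```python
import itertools
import random

def marginal_matrix(M, edges):
    """Rows: P_i for each node, then P_uv for each edge.
    Columns: states w in lexicographic order (000, 001, ..., 111)."""
    states = list(itertools.product((0, 1), repeat=M))
    rows = [[w[i] for w in states] for i in range(M)]
    rows += [[w[u] * w[v] for w in states] for (u, v) in edges]
    return rows, states

def satisfies_triangle_inequalities(P1, P2, P3, P12, P13, P23, tol=1e-12):
    """Check the 16 inequalities (3); each entry below must be >= 0."""
    checks = [
        P1 - P12, P1 - P13, P2 - P12, P2 - P23, P3 - P13, P3 - P23,
        P12, P13, P23,
        P12 + 1 - P1 - P2, P13 + 1 - P1 - P3, P23 + 1 - P2 - P3,
        P1 + P23 - P12 - P13, P2 + P13 - P12 - P23, P3 + P12 - P13 - P23,
        P12 + P13 + P23 + 1 - P1 - P2 - P3,
    ]
    return len(checks) == 16 and all(c >= -tol for c in checks)

edges = [(0, 1), (0, 2), (1, 2)]          # triangle of Example 2
A, states = marginal_matrix(3, edges)

rng = random.Random(0)
all_ok = True
for _ in range(1000):
    raw = [rng.random() for _ in states]
    total = sum(raw)
    p = [x / total for x in raw]           # random point of the simplex
    marg = [sum(a * q for a, q in zip(row, p)) for row in A]
    all_ok = all_ok and satisfies_triangle_inequalities(*marg)
```

This only tests one inclusion (images of probability vectors satisfy the inequalities); the converse is the content of the polytope description.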

Remark 3. If G is the complete graph on M = 4 nodes then the corresponding 10- 
dimensional polytope is described by 56 facet-defining inequalities. In general, the 
number of facets of this polytope grows at least exponentially in M. See Wainwright 
and Jordan [J5] for an approach which carefully addresses these issues of complexity. 

In general, our graphical model can be represented as a toric model as in Geiger
et al. [20] or Pachter and Sturmfels [37, §1.2] by defining the appropriate integer
matrix $A_G$. This matrix represents the linear map which takes the vector of
elementary probabilities to the vector of marginals. To be precise, the matrix $A_G$ has $2^M$
columns and $M + |E| + 1$ rows and its entries are in $\{0,1\}$. The columns of $A_G$ are
indexed by the elementary probabilities

$$p_w = p_{w_1 w_2 \cdots w_M} = P(X_1 = w_1, \ldots, X_M = w_M), \qquad w = (w_1, \ldots, w_M) \in \{0,1\}^M.$$

All rows but the last are indexed by the marginals $P_i$ for $i \in V$ and the correlations
$P_{uv}$ for $\{u,v\} \in E$. The entries in these rows are the coefficients in the expansion of
the marginals in terms of the $p_w$. The last row of $A_G$ has all entries equal to one,
and it corresponds to computing the trivial marginal $\sum_{w \in \{0,1\}^M} p_w = 1$.

To be consistent with the algebraic literature, we replace the model parameters by
their exponentials, thus obtaining new parameters that are assumed to be positive:

$$\theta_i := \exp(\eta_i) \text{ for } i \in V \quad \text{and} \quad \theta_{uv} := \exp(\eta_{uv}) \text{ for } \{u,v\} \in E.$$

The model parameterization (1) now translates into the monomial form of [37, §1.2],

$$p_w = \frac{1}{Z} \prod_{i \in V} \theta_i^{w_i} \prod_{(u,v) \in E} \theta_{uv}^{w_u w_v}, \tag{5}$$

where the elementary probabilities are the monomials corresponding to the columns
of $A_G$. The last row of $A_G$ contributes the factor $\frac{1}{Z}$. In multi-dimensional form, the
function mapping parameters to the elementary probabilities is then defined as:

$$f : \mathbb{R}^{M+|E|}_{>0} \to \mathbb{R}^{2^M}, \qquad \theta \mapsto \frac{1}{\sum_{j=1}^{2^M} \theta^{a_j}} \left( \theta^{a_1}, \ldots, \theta^{a_{2^M}} \right), \tag{6}$$

where $\theta^{a_j} := \prod_{i=1}^{M+|E|} \theta_i^{a_{ij}}$. The model is the subvariety of the $(2^M - 1)$-dimensional
probability simplex cut out by the binomial equations

$$\prod_w p_w^{C_w} - \prod_w p_w^{D_w} = 0,$$

where $C, D$ run over pairs of vectors in $\mathbb{N}^{2^M}$ such that $A_G \cdot C = A_G \cdot D$.

With the new notation, one can represent Example 2 in the following way. $A_G$
is the $7 \times 8$-matrix obtained by augmenting (4) with a row of ones. The model
parameterization (5) leads to

$$(p_{000}, p_{001}, p_{010}, p_{011}, p_{100}, p_{101}, p_{110}, p_{111})
= \frac{1}{Z} \left( 1,\ \theta_3,\ \theta_2,\ \theta_2\theta_3\theta_{23},\ \theta_1,\ \theta_1\theta_3\theta_{13},\ \theta_1\theta_2\theta_{12},\ \theta_1\theta_2\theta_3\theta_{12}\theta_{13}\theta_{23} \right).$$

The model is the hypersurface in the seven-dimensional probability simplex given by

$$p_{000}\, p_{011}\, p_{101}\, p_{110} = p_{001}\, p_{010}\, p_{100}\, p_{111}. \tag{7}$$
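The binomial relation (7) can be verified numerically for arbitrary positive parameters. The snippet below is a small sanity check, assuming the monomial parameterization of the triangle example; the function name `triangle_probabilities` and the random parameter draw are illustrative choices.

```python
import math
import random

def triangle_probabilities(theta1, theta2, theta3, theta12, theta13, theta23):
    """Elementary probabilities of the triangle model in monomial form."""
    monomials = {
        "000": 1.0,
        "001": theta3,
        "010": theta2,
        "011": theta2 * theta3 * theta23,
        "100": theta1,
        "101": theta1 * theta3 * theta13,
        "110": theta1 * theta2 * theta12,
        "111": theta1 * theta2 * theta3 * theta12 * theta13 * theta23,
    }
    Z = sum(monomials.values())  # normalization constant
    return {w: m / Z for w, m in monomials.items()}

rng = random.Random(1)
thetas = [math.exp(rng.uniform(-2.0, 2.0)) for _ in range(6)]  # positive parameters
p = triangle_probabilities(*thetas)

lhs = p["000"] * p["011"] * p["101"] * p["110"]
rhs = p["001"] * p["010"] * p["100"] * p["111"]
```

Both sides reduce to $\theta_1^2\theta_2^2\theta_3^2\theta_{12}\theta_{13}\theta_{23}/Z^4$, so `lhs` and `rhs` agree up to floating-point rounding.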

Next, with this new notation, we introduce the second ingredient of the proof:
Birch's theorem, which implies that this six-dimensional toric hypersurface (7) is
mapped bijectively onto the six-dimensional polytope (3) under the linear map $A_G$.

Theorem 4 (Birch's theorem). Every non-negative point on the toric variety spec- 
ified by an integer matrix A is mapped bijectively onto the corresponding polytope 
conv(A) under the linear map A. 

A proof of Birch's theorem can be found in Appendix A. For a more complete
treatment, see e.g. [37, Theorem 1.10]. Now we are ready to prove Theorem 1.

Proof of Theorem 1. The linear map $A_G$ maps the $(2^M - 1)$-dimensional probability
simplex onto the convex hull $\mathrm{conv}(A_G)$ of the column vectors of $A_G$. The mapping
$A \mapsto \mathrm{conv}(A)$ is usually referred to as the marginal map of a log-linear model in
statistics (Christensen [7]), or moment map in toric geometry in mathematics (Fulton [19, §4]).
The convex polytope $\mathrm{conv}(A_G)$ therefore consists of all vectors of marginals
that arise from some probability distribution on $M$ binary random variables. Now
applying Birch's theorem to the matrix $A_G$ yields the assertion of the theorem. □

As a corollary, we conclude by the Main Theorem for Polytopes [47, Theorem
1.1, page 29] that the possible marginals $P_\bullet$ arising from Equation (2) are always
characterized by a finite set of linear inequalities as in (3). We also note that the
above techniques, especially Birch's theorem, are instrumental for model calibration.
Calibration There are several algorithms for finding the unique model parameters
matching any given set of marginal default probabilities and correlations under the
general formulation of Equation (1). The calibration problem is equivalent to
maximum likelihood estimation for toric models. Indeed, suppose that one is given a
data vector $u \in \mathbb{N}^{2^M}$ whose coordinates specify how many times each of the states in
$\{0,1\}^M$ was observed. This data gives rise to an empirical probability distribution
with empirical marginals

$$\frac{1}{\sum_{w \in \{0,1\}^M} u_w}\, A_G \cdot u = (P_\bullet),$$

where $A_G$ is defined as in Section 2.1.
The likelihood function of the data $u$ is the following function of the model parameters:

$$\mathbb{R}^{M+|E|} \to \mathbb{R}_{>0}, \qquad \eta \mapsto \prod_{w \in \{0,1\}^M} p_w(\eta)^{u_w}. \tag{8}$$

Here $p_w(\cdot)$ is defined as in Equation (1). Thus, a direct consequence of Theorem 1 is

Corollary 5. The likelihood function has a unique maximizer $\hat{\eta}$. This is the
unique parameter vector whose probability distribution implied by Equation (1) has
the empirical marginals $(P_\bullet)$.

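The key idea in the proof (given in Appendix B) implies that computing the
maximum likelihood parameters amounts to solving the following optimization problem:

$$\max \ -\sum_{w \in \{0,1\}^M} p_w \log(p_w) \tag{9}$$

$$\text{s.t.} \quad p_w = \frac{1}{Z} \exp\Big( \sum_{i \in V} \eta_i w_i + \sum_{(u,v) \in E} \eta_{uv} w_u w_v \Big) \quad \forall w \in \{0,1\}^M \tag{10}$$

$$A_G \cdot p = P_\bullet \tag{11}$$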

Note that on the polytope of all probability distributions $p$ with constraint (11),
the objective function (9) as a function of $p$ is strictly concave with its maximizer
$\hat{p}$ being the distribution represented by $\hat{\eta}$. One can thus apply convex optimization
techniques to solve the parameter estimation problem in our graphical model. In
fact, this optimization problem is also known as geometric programming. See Boyd
et al. [6] for an introduction to this subject.

Parameter estimation in small toric models can be accomplished with the
Iterative Proportional Fitting (IPF) of Darroch and Ratcliff [10]; see Sturmfels [HI §8.4] for
an algebraic description and a Maple implementation. Such a straightforward
implementation of IPF requires iterative updates of vectors with $2^M$ coordinates, which
is infeasible for larger values of $M$. To remedy this challenge, one needs to turn
to the large-scale computational methods used in machine learning. Popular
methods aside from the convex optimization techniques mentioned above include those
based on quasi-Newton methods such as LM-BFGS [36], conjugate gradient ascent,
log-determinant relaxation [15], and local methods related to pseudolikelihood
estimation (particularly in the sparse case).
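For a small graph, the calibration can be sketched with a basic proportional-fitting loop. The code below is not the algorithm of [10] verbatim but a simple variant for binary features, written here as an assumption-laden illustration: it cycles through the sufficient statistics and rescales the probability mass so that each marginal in turn matches its target. All function names, the stopping rule and the test parameters are hypothetical; the targets are assumed to lie strictly inside the polytope.

```python
import itertools
import math

def model_probs(M, edges, eta_node, eta_edge):
    """Elementary probabilities p_w of Equation (1), lexicographic state order."""
    states = list(itertools.product((0, 1), repeat=M))
    weights = [math.exp(sum(eta_node[i] * w[i] for i in range(M))
                        + sum(e * w[u] * w[v] for (u, v), e in zip(edges, eta_edge)))
               for w in states]
    Z = sum(weights)
    return states, [x / Z for x in weights]

def ipf_calibrate(M, edges, P_node, P_edge, max_sweeps=5000, tol=1e-13):
    """Match the marginals P_i and P_uv; return (eta_node, eta_edge)."""
    states = list(itertools.product((0, 1), repeat=M))
    features = [[w[i] for w in states] for i in range(M)]
    features += [[w[u] * w[v] for w in states] for (u, v) in edges]
    targets = list(P_node) + list(P_edge)
    p = [1.0 / len(states)] * len(states)   # uniform start, i.e. all eta = 0
    for _ in range(max_sweeps):
        worst = 0.0
        for a, t in zip(features, targets):
            m = sum(pi for ai, pi in zip(a, p) if ai)   # current marginal
            worst = max(worst, abs(m - t))
            # Rescale the mass on {a = 1} to t and on {a = 0} to 1 - t.
            up, down = t / m, (1.0 - t) / (1.0 - m)
            p = [pi * (up if ai else down) for ai, pi in zip(a, p)]
        if worst < tol:
            break
    # Every update multiplies p_w by a log-linear factor, so p stays of the
    # form (1) and the parameters can be read off from log-probability ratios.
    logp = dict(zip(states, (math.log(pi) for pi in p)))
    def state(*on):
        return tuple(1 if i in on else 0 for i in range(M))
    eta_node = [logp[state(i)] - logp[state()] for i in range(M)]
    eta_edge = [logp[state(u, v)] - logp[state(u)] - logp[state(v)] + logp[state()]
                for (u, v) in edges]
    return eta_node, eta_edge

# Round trip on the triangle of Example 2 with made-up parameters.
edges = [(0, 1), (0, 2), (1, 2)]
true_node, true_edge = [-2.0, -1.5, -1.0], [0.8, 0.5, -0.3]
states, p_true = model_probs(3, edges, true_node, true_edge)
P_node = [sum(pw for w, pw in zip(states, p_true) if w[i]) for i in range(3)]
P_edge = [sum(pw for w, pw in zip(states, p_true) if w[u] and w[v]) for (u, v) in edges]
fit_node, fit_edge = ipf_calibrate(3, edges, P_node, P_edge)
```

As noted in the text, each sweep touches all $2^M$ coordinates, so this sketch is only meant for small $M$.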

2.2 One Period Model 

For both ease of exposition and numerical comparison with the existing models in 
literature, we now investigate more specialized forms of the formulation. 

First, we impose some structure on the graph. Take $M = N + S$, where nodes
$1, \ldots, N$ represent individual firms and $N+1, \ldots, N+S$ represent individual industry
sectors, so that the joint probability distribution for $(X_1, \ldots, X_N)$ is defined as:

$$P(X_1 = x_1, \ldots, X_N = x_N) := \sum_{s \in \{0,1\}^S} Q(X_1 = x_1, \ldots, X_N = x_N,\ X_{N+1} = s_1, \ldots, X_{N+S} = s_S). \tag{12}$$

Here the probability distribution $Q$ on the right hand side is specified by Equation (1).

Next, to capture the dependency among different industry sectors, we specify the
parameters $\eta_i$ and $\eta_{uv}$ as follows. We assume that each firm belongs to a
particular sector $j = 1, 2, \ldots, S$ such that firm nodes $\{1, \ldots, N\}$ are partitioned into $S$
subsets with $N_j$ elements, i.e. $N = \sum_{j=1}^{S} N_j$. Moreover, a number of homogeneity
assumptions are imposed for simplicity:

• Each firm node i has a single edge, which connects to its respective sector node. 

• For any particular sector node $j$, all firm nodes that connect to it have the
same node weight $\eta_{F_j}$, and same edge weight $\eta_{FS_j}$.

• Sector nodes are allowed to have different node weights $\eta_{S_j}$, and they can
connect to each other with different edge weights $\eta_{N+u,N+v}$.

In short, the probability distribution for $(X_1, \ldots, X_N)$ in Equation (12) becomes

$$P(X_1 = x_1, \ldots, X_N = x_N) = \frac{1}{Z_S} \sum_{s \in \{0,1\}^S} \exp\Big( \sum_{j=1}^{S} \big( s_j \eta_{S_j} + s_j n_j \eta_{FS_j} + n_j \eta_{F_j} \big) \Big) \exp\Big( \sum_{(u,v):\, u,v \in \{1,\ldots,S\}} s_u s_v \eta_{N+u,N+v} \Big), \tag{13}$$

where $n_j := \sum_{i:\, \eta_{i,N+j} \neq 0} x_i$ is the number of defaulting firms in sector $j$, and $Z_S$ is
the normalization constant. Here a sector random variable $X_{N+j}$ having value 0 is
interpreted as that sector being financially healthy and 1 as it being in distress.

Note that if the graph $G$ breaks up into various connected components, then the
random variables associated with the nodes in each component are independent of
each other. This property allows conditional independence structures to be easily
incorporated into the model: when the state of all other firms is fixed, two
firms not connected by an edge will default independently of each other. Also note
that, by allowing different parameters $\eta_{F_j}$ and $\eta_{FS_j}$ for each sector, one can represent
a diverse portfolio of firms, with possibly negative pairwise default correlations.

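Some simple calculation yields the loss distribution:

Proposition 6. Given the model specified by Equation (13),

$$P\Big( \sum_{i=1}^{N} X_i = n \Big) = \frac{1}{Z_S} \sum_{(n_1,\ldots,n_S):\, \sum_j n_j = n}\ \sum_{s \in \{0,1\}^S} \exp\Big( \sum_{(u,v):\, u,v \in \{1,\ldots,S\}} s_u s_v \eta_{N+u,N+v} \Big) \prod_{j=1}^{S} \binom{N_j}{n_j} \exp\big( s_j \eta_{S_j} + s_j n_j \eta_{FS_j} + n_j \eta_{F_j} \big), \tag{14}$$

where

$$Z_S = \sum_{s \in \{0,1\}^S} \exp\Big( \sum_{j=1}^{S} s_j \eta_{S_j} + \sum_{(u,v):\, u,v \in \{1,\ldots,S\}} s_u s_v \eta_{N+u,N+v} \Big) \prod_{j=1}^{S} \big( 1 + e^{\eta_{F_j} + s_j \eta_{FS_j}} \big)^{N_j}.$$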

Connection to Binomial Distribution and Fat Tails Our model is related to the
binomial distribution through the simple observation that the probability distribution
in Equation (14) can be decomposed into a summation of $2 \cdot S$ independent binomial
random variables. Indeed, note that when $S = 1$,

$$P(X_1 = x_1, \ldots, X_N = x_N) = \frac{1}{Z_1} \Big( e^{\eta_F \sum_i x_i} + e^{\eta_S + (\eta_{FS} + \eta_F) \sum_i x_i} \Big), \tag{15}$$

$$P\Big( \sum_{i=1}^{N} X_i = n \Big) = \frac{1}{Z_1} \binom{N}{n} \Big[ e^{n \eta_F} + e^{\eta_S + n \eta_F + n \eta_{FS}} \Big], \tag{16}$$

$$Z_1 = (1 + e^{\eta_F})^N + e^{\eta_S} (1 + e^{\eta_F + \eta_{FS}})^N, \tag{17}$$

where $\eta_{FS_1}, \eta_{S_1}, \eta_{F_1}$ are replaced by $\eta_{FS}, \eta_S, \eta_F$ respectively for notational simplicity.

Figure 1: A 12-firm graph with three sector nodes: Square nodes represent the sectors
and circle nodes represent firms.



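Proposition 7. Given the model (13), when $S = 1$,

$$\sum_{i=1}^{N} X_i = Y B_1 + (1 - Y) B_2,$$

where $Y, B_1, B_2$ are independent and distributed as

$$Y \sim \text{Bernoulli}\left( \frac{e^{\eta_S} (1 + e^{\eta_F} e^{\eta_{FS}})^N}{Z_1} \right), \quad
B_1 \sim \text{Binomial}\left( N, \frac{e^{\eta_F} e^{\eta_{FS}}}{1 + e^{\eta_F} e^{\eta_{FS}}} \right), \quad
B_2 \sim \text{Binomial}\left( N, \frac{e^{\eta_F}}{1 + e^{\eta_F}} \right). \tag{18}$$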
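The mixture representation (18) can be compared numerically with the closed form (16)-(17). The following sketch assumes the single-sector setting; the function names and the parameter triplet $(\eta_S, \eta_{FS}, \eta_F) = (2.0, -2.1, -2.0)$ with $N = 125$ are illustrative choices, not calibrated values.

```python
import math

def loss_pmf_direct(N, eta_S, eta_FS, eta_F):
    """Loss distribution from Equations (16)-(17)."""
    Z1 = (1 + math.exp(eta_F)) ** N + math.exp(eta_S) * (1 + math.exp(eta_F + eta_FS)) ** N
    return [math.comb(N, n)
            * (math.exp(n * eta_F) + math.exp(eta_S + n * (eta_F + eta_FS))) / Z1
            for n in range(N + 1)]

def binom_pmf(N, p):
    return [math.comb(N, n) * p ** n * (1 - p) ** (N - n) for n in range(N + 1)]

def loss_pmf_mixture(N, eta_S, eta_FS, eta_F):
    """Loss distribution as the Bernoulli mixture of two binomials in (18)."""
    a = (1 + math.exp(eta_F)) ** N
    b = math.exp(eta_S) * (1 + math.exp(eta_F + eta_FS)) ** N
    y = b / (a + b)                                   # P(Y = 1)
    p1 = 1 / (1 + math.exp(-(eta_F + eta_FS)))        # success probability of B_1
    p2 = 1 / (1 + math.exp(-eta_F))                   # success probability of B_2
    return [y * u + (1 - y) * v
            for u, v in zip(binom_pmf(N, p1), binom_pmf(N, p2))]

direct = loss_pmf_direct(125, 2.0, -2.1, -2.0)
mixture = loss_pmf_mixture(125, 2.0, -2.1, -2.0)
```

The two lists agree term by term up to rounding, which is exactly the statement of Proposition 7 at the level of the loss distribution.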



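Corollary 8. Under the assumptions of Proposition 7,

$$X_i = Y R_i + (1 - Y) U_i, \quad \text{with} \quad
R_i \sim \text{Bernoulli}\left( \frac{e^{\eta_F} e^{\eta_{FS}}}{1 + e^{\eta_F} e^{\eta_{FS}}} \right), \quad
U_i \sim \text{Bernoulli}\left( \frac{e^{\eta_F}}{1 + e^{\eta_F}} \right), \tag{19}$$

and $R_1, \ldots, R_N, U_1, \ldots, U_N, Y$ are mutually independent. Moreover,

$$\begin{aligned}
\mathrm{Corr}(X_1, X_2) &= \mathrm{Corr}(Y V_1 + U_1,\ Y V_2 + U_2) = \mathrm{Corr}(Y V_1, Y V_2) \\
&= \frac{E[Y^2] E[V_1] E[V_2] - E[Y]^2 E[V_1] E[V_2]}{\sqrt{\mathrm{Var}(Y V_1)} \sqrt{\mathrm{Var}(Y V_2)}} = \frac{\mathrm{Var}(Y) E[V]^2}{\mathrm{Var}(Y V)} \\
&= \frac{\mathrm{Var}(Y E[V])}{\mathrm{Var}(Y V)},
\end{aligned} \tag{20}$$

where $V = V_i := R_i - U_i$ and independent of all other random variables.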
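The marginal default probability and pairwise correlation can also be computed directly from the joint law of $(X_1, X_2)$ implied by the representation (19): conditionally on $Y$, the two firms default independently. The sketch below takes this route; the function name is hypothetical, and the parameter values $(\eta_S, \eta_{FS}, \eta_F) = (5.514, -5, -2.76)$ with $N = 50$ are those quoted for the multi-period example in Section 3.1.

```python
import math

def single_sector_stats(N, eta_S, eta_FS, eta_F):
    """Return (q, rho): P(X_i = 1) and Corr(X_1, X_2) for S = 1."""
    a = (1 + math.exp(eta_F)) ** N
    b = math.exp(eta_S) * (1 + math.exp(eta_F + eta_FS)) ** N
    y = b / (a + b)                                  # P(Y = 1), see (18)
    p_R = 1 / (1 + math.exp(-(eta_F + eta_FS)))      # P(R_i = 1)
    p_U = 1 / (1 + math.exp(-eta_F))                 # P(U_i = 1)
    q = y * p_R + (1 - y) * p_U                      # P(X_i = 1)
    joint = y * p_R ** 2 + (1 - y) * p_U ** 2        # P(X_1 = X_2 = 1)
    rho = (joint - q * q) / (q * (1 - q))            # linear correlation
    return q, rho

q, rho = single_sector_stats(50, 5.514, -5.0, -2.76)
```

With these inputs the computed values should land close to the single-firm default probability of 0.005 and the default correlation of 0.05 reported for that example.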

One implication of Proposition 7 is that one can have control over the tails of
the loss distribution. Of the two binomial random variables, varying the
parameters affecting the center of the higher mean random variable to increase (decrease)
its mean results in thicker (thinner) tails. Moreover, the loss distribution may be
bimodal. Bimodality can be explained by a "contagion" effect among firms.
Having a high number of defaults may make it more likely for "neighboring" firms to
default. This phenomenon enables our model to correct the so-called "correlation
smile" (e.g. Amato and Gyntelberg [2], Hager and Schobel [23]) in pricing CDOs,
since a low probability for mezzanine level defaults naturally leads to lower spreads
for the respective tranche. These will be illustrated in detail in Section 3.2.

2.3 Multi-period Model 

In this section, we shall extend the one-period model to a multi-period one. This 
extension is essential for pricing defaultable derivatives and for comparison with 
standard copula models. 

The construction is as follows: 

• Start with a single-sector graph with $N$ firms. At each payment period $t_k$, the
graph evolves by the defaulting of some nodes. Furthermore, some of the
previously defaulted nodes are removed. Economically, removal of nodes represents
that these firms are no longer influencing or providing useful information about
the default process of other firms. Therefore, the number of firms remaining
in the system is dynamic, and is denoted by $N_{t_k}$. Denote the number of firms
that have defaulted up to $t_k$ by $D_{t_k}$. Then, $D_0 = 0$ and $N_0 = N$.

• Each defaulted node "stays" in the system for a geometrically distributed
number of time steps (with "success" or "removal" probability $p_R$), independent of
everything else. This is equivalent to removing each defaulted node from the
system with probability $p_R$, independent of everything else, at the beginning
of $t_k$. Thus, the number of nodes that are currently in default and still in the
system at time $t_k$, $I_{t_k}$, is given by:

$$I_{t_k} = D_{t_k} + N_{t_k} - N.$$

• The number of additional defaults $D_{t_{k+1}} - D_{t_k}$ during the period $(t_k, t_{k+1})$ is based
on a conditioning of the probability distribution specified by Equation (16).
More specifically:

$$P\big( D_{t_{k+1}} - D_{t_k} = n \mid I_{t_k} = m;\ N_{t_k} \big)
:= \sum_{i_1, \ldots, i_n:\ i_j \in \{m+1, \ldots, N_{t_k}\}} P\big( X_{i_1} = \cdots = X_{i_n} = 1,\ X_{i_{n+1}} = \cdots = X_{i_{N_{t_k} - m}} = 0 \,\big|\, X_1 = \cdots = X_m = 1 \big), \quad n + m \le N_{t_k}. \tag{21}$$

This construction, together with some simple calculation, leads to

Proposition 9.

$$P\big( D_{t_{k+1}} - D_{t_k} = n \mid I_{t_k} = m;\ N_{t_k} \big) = P\big( n;\ N_{t_k} - m,\ \eta_S + m \eta_{FS},\ \eta_{FS},\ \eta_F \big),$$

where

$$P(n;\ N, \eta_S, \eta_{FS}, \eta_F) := P\Big( \sum_{i=1}^{N} X_i = n \Big)$$

as defined by Equation (16) and $Z_1$ as defined by Equation (17).



2.3.1 Simulation and CDO pricing 

Based on the above proposition, it is easy to see that this multi-period model can
be simulated as follows. At time 0, the graph has $N$ non-defaulted firms. At time
$t_1$, no removal of nodes occurs since none of the firms were in default at time 0. The
number of firms that default during period $(0, t_1)$, $D_1 - D_0 = D_1$, is determined by
sampling from Equation (16). At time $t_2$, each of the $D_1$ nodes is removed from the
graph with probability $p_R$. The additional number of defaults $D_2 - D_1$ is determined
by Equation (21), where $N_2$ is the total number of firms remaining after the removal,
and $I_{t_2}$ is the number of firms among $D_1$ that have not been removed from the graph.
Continue in this fashion until $K$ periods are covered.
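The simulation recipe can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: removals are applied at the start of each period, the conditional default law is taken from Proposition 9, and the names `cond_default_pmf` and `simulate_path` as well as the parameter values in the example call are chosen here for convenience.

```python
import math
import random

def cond_default_pmf(n_alive, n_infl, eta_S, eta_FS, eta_F):
    """Proposition 9: pmf of the number of new defaults among n_alive
    non-defaulted firms, given n_infl defaulted firms still in the graph."""
    s = eta_S + n_infl * eta_FS                      # shifted sector weight
    w = [math.comb(n_alive, n)
         * (math.exp(n * eta_F) + math.exp(s + n * (eta_F + eta_FS)))
         for n in range(n_alive + 1)]
    Z = sum(w)
    return [x / Z for x in w]

def simulate_path(N, K, p_R, eta_S, eta_FS, eta_F, rng):
    """Return the cumulative default counts D_{t_1}, ..., D_{t_K} of one path."""
    D, I = 0, 0      # cumulative defaults; defaulted firms still in the graph
    path = []
    for _ in range(K):
        # Each defaulted node still in the graph is removed with probability p_R.
        I = sum(1 for _ in range(I) if rng.random() >= p_R)
        pmf = cond_default_pmf(N - D, I, eta_S, eta_FS, eta_F)
        new = rng.choices(range(N - D + 1), weights=pmf)[0]
        D += new
        I += new
        path.append(D)
    return path

path = simulate_path(50, 10, 0.3, 5.514, -5.0, -2.76, random.Random(42))
```

In the first period no firm is in default, so the removal step is vacuous and the draw reduces to sampling from Equation (16).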

Moreover, the homogeneity assumptions for the one-period single-sector model
imply that our model can be perceived as a discrete-time Markov chain with the
two-dimensional state $(D_{t_k}, N_{t_k})$, as $(D_{t_{k+1}}, N_{t_{k+1}})$ only depends on $(D_{t_k}, N_{t_k})$.
Indeed, the transition matrix $P$ of the Markov chain is given by:

$$P_{(D_{t_k}, N_{t_k}) \to (D_{t_{k+1}}, N_{t_{k+1}})} = \binom{I_{t_k}}{N_{t_k} - N_{t_{k+1}}}\, p_R^{\,N_{t_k} - N_{t_{k+1}}}\, (1 - p_R)^{D_{t_k} - N + N_{t_{k+1}}} \cdot P\big( D_{t_{k+1}} - D_{t_k};\ N_{t_k} - I_{t_k},\ \eta_S + I_{t_k} \eta_{FS},\ \eta_{FS},\ \eta_F \big), \tag{22}$$

$$N \ge D_{t_{k+1}} \ge D_{t_k} \ge 0 \quad \text{and} \quad N \ge N_{t_k} \ge N_{t_{k+1}} \ge 0,$$

with $D_0 = 0$, $N_0 = N$. This Markov chain formulation is useful for analytical
calculation of loss distribution and CDO prices.
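The Markov chain view allows the loss distribution after $K$ periods to be computed exactly by propagating state probabilities. The sketch below tracks the equivalent state $(D, I)$, using $N_{t_k} = N - D_{t_k} + I_{t_k}$, and follows the timing of the simulation description (removal first, then new defaults given the surviving defaulted nodes). This timing convention, the function name and the example parameters are assumptions of this illustration.

```python
import math
from functools import lru_cache

def loss_distribution(N, K, p_R, eta_S, eta_FS, eta_F):
    """Return [P(D_{t_K} = d) for d = 0..N] by forward propagation."""

    @lru_cache(maxsize=None)
    def new_default_pmf(alive, infl):
        # Proposition 9 with 'alive' healthy firms and 'infl' defaulted nodes.
        s = eta_S + infl * eta_FS
        w = [math.comb(alive, n)
             * (math.exp(n * eta_F) + math.exp(s + n * (eta_F + eta_FS)))
             for n in range(alive + 1)]
        Z = sum(w)
        return tuple(x / Z for x in w)

    dist = {(0, 0): 1.0}                 # state (D, I); N_t = N - D + I
    for _ in range(K):
        nxt = {}
        for (D, I), pr in dist.items():
            for kept in range(I + 1):    # defaulted nodes surviving removal
                p_keep = math.comb(I, kept) * (1 - p_R) ** kept * p_R ** (I - kept)
                for n, pn in enumerate(new_default_pmf(N - D, kept)):
                    key = (D + n, kept + n)
                    nxt[key] = nxt.get(key, 0.0) + pr * p_keep * pn
        dist = nxt
    loss = [0.0] * (N + 1)
    for (D, _), pr in dist.items():
        loss[D] += pr
    return loss

loss_5 = loss_distribution(20, 5, 0.3, 5.514, -5.0, -2.76)
```

For $K = 1$ the result coincides with Equation (16), since no node can be removed before the first period.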

For purposes of model comparison in the later sections, we briefly discuss pricing 
of CDOs in our model. 

Collateralized Debt Obligation (CDO) Pricing A Collateralized Debt Obligation
(CDO) is a portfolio of defaultable instruments (loans, credits, bonds or default
swaps), whose credit risk is sold to investors who agree to bear the losses in the
portfolio, in return for a periodic payment. A CDO is usually sold in tranches, which are
specified by their attachment points $K_L$ and detachment points $K_U$ as a percentage
of total notional of the portfolio. The holder of a tranche is responsible for covering
all losses in excess of $K_L$ percent of the notional, up to $K_U$ percent. In return, the
premiums he receives are adjusted according to the remaining notional he is
responsible for. In the case of popularly traded tranches on the North American Investment
Grade Credit Default Swap Index (CDX.NA.IG), the tranches are named equity,
mezzanine, senior, senior, super-senior with attachment and detachment points of
0-3, 3-7, 7-10, 10-15, 15-30 percent respectively.

Given an underlying portfolio, and fixed attachment and detachment points for all
tranches, the pricing problem is the determination of periodic payment percentages
(usually called spreads) $s_l$ for all tranches, assuming the market is complete and the
default-free interest rate is independent of the credit risk of securities in the portfolio.

If we denote the total notional of the portfolio by $M$, the periodic payment dates
by $t_1, \ldots, t_K$, the date of inception of the contract by $t_0 := 0$, the payment period
$t_{k+1} - t_k$ by $\gamma$, the total percentage of loss in the portfolio by time $t_k$ by $C_{t_k}$, the
attachment (detachment) point for tranche $l$ by $K_{L_l}$ ($K_{U_l}$), and the discount factor from
$t_0$ to $t_k$ by $\beta(t_0, t_k)$, then it is clear that specifying the distribution for $C_{t_k}$ for
$k = 1, \ldots, K$ is sufficient for pricing purposes.
To see this, note that the percentage of loss $C_{l,t}$ suffered by the holders of tranche $l$
up to time $t$ is given by:

$$C_{l,t} := \min\{C_t, K_{U_l}\} - \min\{C_t, K_{L_l}\}. \tag{23}$$

Consequently, the value at time $t_0$ of payments received by the holder of tranche $l$ is

$$\sum_{k=1}^{K} \beta(t_0, t_k)\, s_l\, \gamma\, M\, E\big[ K_{U_l} - K_{L_l} - C_{l,t_k} \big]. \tag{24}$$

Similarly, the value at time $t_0$ of payments made by the holder of tranche $l$ is given
by

$$\sum_{k=1}^{K} \beta(t_0, t_k)\, M\, E\big[ C_{l,t_k} - C_{l,t_{k-1}} \big]. \tag{25}$$

In order to prevent arbitrage, the premium $s_l$ needs to be chosen such that the value
of payments received is equal to the value of payments made. Therefore,

$$s_l = \frac{\sum_{k=1}^{K} \beta(t_0, t_k) \big( E[C_{l,t_k}] - E[C_{l,t_{k-1}}] \big)}{\sum_{k=1}^{K} \beta(t_0, t_k)\, \gamma\, \big( K_{U_l} - K_{L_l} - E[C_{l,t_k}] \big)}. \tag{26}$$

Now our focus is to calculate the distribution for $C_{t_k}$ in our multi-period model.
Denoting the $k$-step transition matrix (in Equation (22)) for the Markov chain by
$P^k$ and the number of losses at the $k$-th step by $L_k$, then

$$P\Big( C_{t_k} = \frac{m}{N} \Big) = P(L_k = m) = \sum_{n=0}^{N} P^k_{(0,N) \to (m,n)}, \tag{27}$$

and the spreads are given by

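Proposition 10. Given the yield curve $\beta$, attachment ($K_{L_l}$) and detachment ($K_{U_l}$)
points, and the implied Markov transition matrix $P$, the spread of tranche $l$ is

$$s_l = \frac{\sum_{k=1}^{K} \beta(t_0, t_k) \sum_{m=0}^{N} \sum_{n=0}^{N} \big( \min\{\tfrac{m}{N}, K_{U_l}\} - \min\{\tfrac{m}{N}, K_{L_l}\} \big) \big( P^k_{(0,N) \to (m,n)} - P^{k-1}_{(0,N) \to (m,n)} \big)}{\sum_{k=1}^{K} \beta(t_0, t_k)\, \gamma\, \Big( K_{U_l} - K_{L_l} - \sum_{m=0}^{N} \sum_{n=0}^{N} \big( \min\{\tfrac{m}{N}, K_{U_l}\} - \min\{\tfrac{m}{N}, K_{L_l}\} \big) P^k_{(0,N) \to (m,n)} \Big)}.$$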
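Once the loss distributions $P(L_k = m)$ are available for each payment date, the spread formula is a ratio of two discounted sums. The sketch below evaluates it under the convention of (27) that the portfolio loss fraction equals $m/N$; the function names, argument layout and the toy inputs are assumptions made for illustration only.

```python
def tranche_loss(c, K_L, K_U):
    """Equation (23): tranche loss for portfolio loss fraction c."""
    return min(c, K_U) - min(c, K_L)

def tranche_spread(loss_pmfs, K_L, K_U, discount, gamma):
    """Fair spread of a tranche.

    loss_pmfs[k-1][m] = P(L_k = m) for k = 1..K and m = 0..N,
    discount[k-1]     = beta(t_0, t_k),
    gamma             = length of a payment period.
    """
    N = len(loss_pmfs[0]) - 1
    expected = [0.0]                      # E[C_{l,t_0}] = 0 at inception
    for pmf in loss_pmfs:
        expected.append(sum(tranche_loss(m / N, K_L, K_U) * pm
                            for m, pm in enumerate(pmf)))
    protection = sum(b * (expected[k] - expected[k - 1])
                     for k, b in enumerate(discount, start=1))
    premium = sum(b * gamma * (K_U - K_L - expected[k])
                  for k, b in enumerate(discount, start=1))
    return protection / premium

# Toy inputs: two firms, two payment dates, no discounting.
toy_pmfs = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25]]
spread = tranche_spread(toy_pmfs, 0.0, 0.5, [1.0, 1.0], 1.0)
```

The notional $M$ cancels between the two legs, which is why it does not appear as an argument.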
3 Sensitivity Analysis and Comparison with One-Factor Normal Copula

In this section, some numerical results on the proposed model are reported, and they
are compared with both the static (one-period) and dynamic (multi-period) one-factor
normal copula model. Throughout the section, $S = 1$ is assumed for simplicity.

3.1 Static Characteristics 

Correlation for single-period model First, we analyze the effects of the
parameterization triplets $(\eta_S, \eta_F, \eta_{FS})$ on the correlation between two firms $\mathrm{corr}(X_i, X_j)$.
Note the following statement.

Proposition 11. Given fixed $\eta_S$, $\eta_{FS}$ and $q \in (0,1)$, the $\eta_F$ value that gives
$P(X_i = 1) = q$ in Equation (15) is given by

$$g(e^{\eta_F}) + e^{\eta_S}\, g(e^{\eta_{FS}} e^{\eta_F}) = 0,$$

where, consistent with Equations (15) and (17), $g(x) := (1 + x)^{N-1} \big( q - (1 - q)\, x \big)$.

Take $q = 0.01, 0.05$ and $N = 125$. Figures 2-5 show the correlation values for
the four quadrants on $\eta_S$ and $\eta_{FS}$. For each point, Proposition 11 is utilized for
calculating the $\eta_F$ value that achieves the desired $q$ value. Note that $\eta_{FS}$ is the
dominant parameter when $\eta_{FS}$ values are close to 0. As $\eta_{FS}$ moves away from 0,
the effect of $\eta_S$ increases. Also note that it is possible to obtain high degrees of linear
correlation levels even for $q = 0.01$. This extends the abilities of multi-firm extensions
of intensity-based models in literature (see [SSJ §10.5] for a discussion).

Loss distribution for single-period model Figure 6 exhibits the shape and fat
tails of the loss distribution for different parameters. Here $N = 125$, $\eta_{FS} = -2.1$,
$P(X_i = 1) = 0.05$, with $\eta_S$ calibrated to match the given correlation level $\rho$ and $\eta_F$ to
match the given marginal default probability. The figure shows the loss distribution
for correlation levels 0.01, 0.02, 0.05, 0.07. As expected, the mass shifts towards
the tail as correlation increases. All the distributions are bimodal, which facilitates
having significantly fat tails.
Figure 2: Variation of $\rho$ for fixed marginal default probabilities, $\eta_S < 0$, $\eta_{FS} > 0$. Panels: (a) $q = 0.01$, (b) $q = 0.05$.

Figure 3: Variation of $\rho$ for fixed marginal default probabilities, $\eta_S < 0$, $\eta_{FS} < 0$. Panels: (a) $q = 0.01$, (b) $q = 0.05$.

Figure 4: Variation of $\rho$ for fixed marginal default probabilities, $\eta_S > 0$, $\eta_{FS} > 0$. Panels: (a) $q = 0.01$, (b) $q = 0.05$.

Figure 5: Variation of $\rho$ for fixed marginal default probabilities, $\eta_S > 0$, $\eta_{FS} < 0$. Panels: (a) $q = 0.01$, (b) $q = 0.05$.

Heavy Tails for Multi-period Model. Figure 7 shows the effect of different
parameterizations on loss distributions in a multi-period model. Take $N = 50$ firms
and $(\eta_S, \eta_{FS}, \eta_F) = (5.514, -5, -2.76)$. The parameters are chosen to correspond to
a single-firm default probability of 0.005 and a default correlation of 0.05. The loss
distribution is then calculated after 5 and 10 steps, with the removal probabilities
$p_R = 0.1, 0.3, 0.5, 0.999$. Note that by varying $p_R$, the tail of the loss distribution can
be controlled, as shown in Figure 7. Increasing $p_R$ results in thinner tails, whereas
lower $p_R$ values can produce quite fat tails. As the removal probability increases, the
mass is shifted towards fewer defaults. This is intuitive, since higher $p_R$ values lead
to defaulted firms staying in the system for a shorter time and thus having a less detrimental
effect on financially healthy firms. Moreover, the tails become very significant as the
number of steps increases, despite a relatively low single-step default probability of
0.005. Figure 7(c) shows the importance of choosing a suitable single-step default
probability. High values for this quantity result in rather strong shifts of mass
towards high numbers of defaults. Therefore, for a fixed maturity, whenever the
number of steps is increased, the single-step default probability has to be scaled
down accordingly.

Figure 6: Distribution of number of defaults under different correlations, $\eta_{FS} = -2.1$, $\eta_S$, $\eta_F$ varying. Horizontal axis: number of defaults.



3.2 Comparison to Normal Copula 

In this section, we compare our model with the widely used one-factor normal copula
model of [33]. Two important attributes are discussed: the heaviness of the tails in
the loss distribution, and the correlation smile in pricing standard tranches.

Figure 7: Evolution of number of defaults for varying disappearance probabilities,
$\eta_F = -2.8$, $\eta_S = 5.514$, $\eta_{FS} = -5$, $\rho = 0.05$, single step default probability 0.005, 50 firms. Panels: (a) single step loss distribution, (b) 5 steps, (c) 10 steps. Horizontal axis: number of defaults.

Heavy tails for One-period Model. One well-known deficiency of normal copulas
with a constant correlation parameter for all pairs of firms is that the "tails" of
the loss distribution are thinner than those observed in market data. In comparison, Proposition 7
suggests that the loss distribution based on our model can achieve fatter tails.

To demonstrate this, first recall that in a copula model, the default indicator $X_i$
for firm $i$ is given by $X_i = \mathbb{I}\{M_i < K\}$, where

$$M_i = \sqrt{\rho_A}\, Y + \sqrt{1 - \rho_A}\, \epsilon_i, \quad i \in \{1, \dots, N\}, \qquad (28)$$
$$Y, \epsilon_i \sim \mathrm{Normal}(0, 1) \ \text{i.i.d.}$$

Here $\rho_A = \mathrm{corr}(M_i, M_j)$ is the asset correlation, assumed to be the same for all pairs
of firms. Note that $K \in \mathbb{R}$ implicitly specifies the marginal default probability
$P(X_i = 1) = \Phi(K)$, which is assumed to be the same across all firms. Note also that the
asset correlation $\rho_A$ is different from the default correlation $\rho = \mathrm{corr}(X_i, X_j)$, which
is given by:

$$\rho = \frac{P(M_i < K,\, M_j < K) - \Phi(K)^2}{\sqrt{\Phi(K)^2 \left(1 - \Phi(K)\right)^2}}$$

This is an important distinction, as asset correlation values for the normal copula 
result in significantly different values of default indicator correlation. 
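The gap between asset correlation and default correlation can be checked numerically. The sketch below is an illustration rather than the authors' code: it uses the standard one-factor representation, in which defaults are independent given the common factor $Y$, and approximates the joint default probability with a trapezoidal rule. The function name, grid size and integration bounds are assumptions chosen for this example.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def default_correlation(q, rho_a, n_grid=4001, y_max=8.0):
    """Default-indicator correlation implied by asset correlation rho_a.

    q is the marginal default probability P(X_i = 1), so K = Phi^{-1}(q).
    Given Y = y, each firm defaults independently with probability
    Phi((K - sqrt(rho_a) * y) / sqrt(1 - rho_a)); the joint default
    probability of two firms is the integral of its square against the
    standard normal density.
    """
    k = STD_NORMAL.inv_cdf(q)
    step = 2.0 * y_max / (n_grid - 1)
    joint = 0.0
    for i in range(n_grid):
        y = -y_max + i * step
        weight = step / 2.0 if i in (0, n_grid - 1) else step  # trapezoid weights
        p_y = STD_NORMAL.cdf((k - math.sqrt(rho_a) * y) / math.sqrt(1.0 - rho_a))
        joint += weight * p_y * p_y * STD_NORMAL.pdf(y)
    return (joint - q * q) / (q * (1.0 - q))


# Parameter pairs (q, rho_A) that appear in this section.
for q, rho_a in ((0.05, 0.042), (0.05, 0.18), (0.001, 0.2), (0.015, 0.3)):
    print(q, rho_a, round(default_correlation(q, rho_a), 4))
```

In every case the default correlation comes out well below the asset correlation, which is the distinction the paragraph above draws.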

To compare the loss distributions implied by the two approaches, take $N = 125$,
$P(X_i = 1) = 0.05$, and two levels of $\rho = 0.01, 0.05$. For the normal copula, these $\rho$
values lead to $\rho_A = 0.042, 0.18$ respectively. For our model, take $(\eta_{FS}, \eta_S, \eta_F) =
(-0.95, 9.2, -2.2)$ and $(-2.1, 15, -2)$ respectively. These parameters are chosen so as to
match the specified $P(X_i = 1)$ and $\rho$. For both levels of correlation, as demonstrated
in Figure 8, our model exhibits fatter tails and has smaller loss probabilities for
intermediate values. Furthermore, the values for the loss distribution are of the
same scale. All these properties help in correcting the deficiencies of the normal
copula when pricing CDOs, as demonstrated next.
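For reference, the normal copula side of this comparison can be reproduced with a short sketch. It assumes the standard conditional-independence form of the one-factor model (a binomial mixture over the common factor) and a trapezoidal rule; the function name and grid settings are hypothetical, and the graphical model's own loss distribution is not computed here.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def copula_loss_distribution(n_firms, q, rho_a, n_grid=2001, y_max=8.0):
    """P(number of defaults = m), m = 0..n_firms, under the one-factor normal copula."""
    k = STD_NORMAL.inv_cdf(q)
    step = 2.0 * y_max / (n_grid - 1)
    binom = [math.comb(n_firms, m) for m in range(n_firms + 1)]
    probs = [0.0] * (n_firms + 1)
    for i in range(n_grid):
        y = -y_max + i * step
        # Trapezoid weight times the density of the common factor Y.
        weight = (step / 2.0 if i in (0, n_grid - 1) else step) * STD_NORMAL.pdf(y)
        # Conditional default probability given Y = y.
        p_y = STD_NORMAL.cdf((k - math.sqrt(rho_a) * y) / math.sqrt(1.0 - rho_a))
        for m in range(n_firms + 1):
            probs[m] += weight * binom[m] * p_y ** m * (1.0 - p_y) ** (n_firms - m)
    return probs


# Settings used in the text: N = 125, P(X_i = 1) = 0.05, rho_A = 0.042 and 0.18.
for rho_a in (0.042, 0.18):
    dist = copula_loss_distribution(125, 0.05, rho_a)
    print(rho_a, "P(at least 20 defaults) =", sum(dist[20:]))
```

The tail probability printed for each $\rho_A$ gives a baseline against which a loss distribution from the graphical model, calibrated to the same marginal and default correlation, could be compared.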



Correlation Smile. For the normal copula, the pricing scheme of [33] and Hull
and White [27] is utilized, where the default time $\tau_i$ for a firm is defined through a
transformation of $M_i$ in Equation (28). More specifically, the risk-free interest rate
$r$ and the recovery rate $R$ are taken to be constants, $\tau_i$ is assumed to be distributed
exponentially with rate $\lambda$, and $M_i$ is mapped to $\tau_i$ using a percentile-to-percentile
transformation so that for any given realization

$$\tau_i = -\frac{\ln\left(1 - \Phi(M_i)\right)}{\lambda}. \qquad (29)$$

The spreads $s_l$ are then calculated by simulating $M_i$ values and replacing the expected
values in the pricing formula in Equation (26) by their respective estimators.
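The percentile-to-percentile map can be illustrated with a few lines of simulation. This is a sketch under assumptions: the function names, the seed and the parameter values are hypothetical, and the survival probability $1 - \Phi(M_i)$ is evaluated through the complementary error function so that large values of $M_i$ do not lead to $\log 0$.

```python
import math
import random


def default_time(m_i, lam):
    """Map a latent variable M_i to an exponential default time with rate lam.

    1 - Phi(M_i) equals 0.5 * erfc(M_i / sqrt(2)), which stays positive for
    large M_i where 1 - Phi(M_i) would round to zero in floating point.
    """
    survival_percentile = 0.5 * math.erfc(m_i / math.sqrt(2.0))
    return -math.log(survival_percentile) / lam


def simulate_default_times(n_firms, rho_a, lam, rng):
    """Draw one scenario of default times from the one-factor normal copula."""
    y = rng.gauss(0.0, 1.0)  # common factor Y
    times = []
    for _ in range(n_firms):
        eps = rng.gauss(0.0, 1.0)  # idiosyncratic factor
        m_i = math.sqrt(rho_a) * y + math.sqrt(1.0 - rho_a) * eps
        times.append(default_time(m_i, lam))
    return times


rng = random.Random(0)  # fixed seed so the sketch is reproducible
scenario = simulate_default_times(n_firms=50, rho_a=0.2, lam=0.001, rng=rng)
defaults_within_5y = sum(t <= 5.0 for t in scenario)
print(defaults_within_5y)
```

Averaging tranche payoffs over many such scenarios gives the estimators that replace the expectations in the pricing formula.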
Figure 8: Comparison of one-factor normal copula and single sector one-period graphical models. Panels: (a) $\rho = 0.01$, (b) $\rho = 0.05$. Horizontal axis: number of defaults. Curves: Graphical, Normal.

Recall that given a standard tranche on CDX.NA.IG, with given observed spread
$s_l$, and known $r$, $R$, $\lambda$, it is possible to "imply" the asset correlation parameter $\rho_A$ in
Equation (28). However, it is known (e.g. [2], [23]) that implying $\rho_A$ in such a manner
across all tranches results in a "smile": the mezzanine tranche has lower implied
correlation compared to the neighboring tranches. One plausible interpretation for
this kind of smile is that the normal copula model underprices the senior tranches
and overprices the equity tranche in comparison to the mezzanine tranche.

We now demonstrate that our model has the potential to correct this smile. To
achieve this, we first calculate prices from the normal copula. We then find parameters
$(\eta_F, \eta_{FS}, \eta_S, p_R)$ such that our model matches the mezzanine tranche spread from
the normal copula exactly, while giving significantly lower spreads for the equity
tranche and higher spreads for the senior tranches.

More specifically, take two different credit rating classes, representing high and 
low credit ratings respectively, so that 

• the one-year default probabilities are set at 0.001 for the high-rating class and 0.015
for the low-rating class,

• the asset correlation values for the normal copula are 0.2 and 0.3 (these values
correspond to default indicator correlations of 0.0059 and 0.0562),

• the recovery rate is 0.4, the interest rate 0.05, and $N = 50$,

• the maturity of the CDO is 5 years with payment frequency 0.5, corresponding
to a ten-period model.
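For the normal copula benchmark, the one-year default probabilities in the list above translate into an exponential rate and a per-period default probability. The sketch below assumes an exponential default time with a constant rate, as in the benchmark's percentile-to-percentile construction; the helper names are hypothetical, and the calibration of $\eta_F$ in the graphical model is a separate step that is not shown.

```python
import math


def hazard_rate(one_year_default_prob):
    """Rate lambda such that P(tau <= 1) = 1 - exp(-lambda) equals the given probability."""
    return -math.log1p(-one_year_default_prob)


def period_default_prob(one_year_default_prob, period_years):
    """Default probability over one payment period for an exponential default time."""
    return -math.expm1(-hazard_rate(one_year_default_prob) * period_years)


for label, p_1y in (("high rating", 0.001), ("low rating", 0.015)):
    lam = hazard_rate(p_1y)
    p_half = period_default_prob(p_1y, 0.5)  # one of the ten half-year periods
    p_5y = -math.expm1(-5.0 * lam)           # cumulative default probability at maturity
    print(label, lam, p_half, p_5y)
```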

Meanwhile, for each rating class, an optimization over $(\eta_{FS}, \eta_S, p_R)$ is run that
maximizes the difference between the equity tranche spreads of the normal copula and
our model, together with the difference between the senior tranche spreads of our model
and the normal copula. The parameter $\eta_F$ is
constrained so that both the one-year default probability and the mezzanine spread
are matched.

Figure 9 shows the output of one such optimization run. It demonstrates that
even with a flat correlation value for the graphical model, one can obtain lower prices
for the equity tranche and higher prices for the senior tranches in comparison to the normal
copula, thus "correcting" the correlation smile.



Figure 9: Tranche spreads for graphical and normal copula models. Panels: (a) High-rating CDO, (b) Low-rating CDO. Horizontal axis: tranche (0-3, 3-7, 7-10, 10-15, 15-30). Series: Graphical, Normal.



4 Conclusion 

This paper proposes and analyzes a simple graphical model for modelling correlated
default. The graphical representation provides an effective shorthand to depict the
dependence relationships between the $N$ firms, with the desirable conditional
independence property. The probability distribution proposed is a toric model, which is
beneficial in both parameter estimation and simulation. With some homogeneity
assumptions, loss distributions and CDO prices are obtained analytically. The model
generates heavy tails in the loss distribution, and its dynamic formulation seems
promising for correcting the correlation smile observed in the one-factor normal copula.
A Birch's Theorem

Our proof follows that in [19], and begins with a lemma.

Lemma 12. Let $A = (a_{ij})$ be a real $d \times m$ matrix of rank $d$ and $\mathrm{pos}(A)$ the $\mathbb{R}_{\geq 0}$-span
of its columns $a_1, \dots, a_m$. Let $t_1, \dots, t_m \in \mathbb{R}_{>0}$ be real positive numbers and define

$$F : \mathbb{R}^d \to \mathbb{R}^d, \qquad F(\eta) = \sum_{j=1}^{m} t_j e^{\langle a_j, \eta \rangle} a_j.$$

That is, $F(\eta) = A \left(t_1 e^{\langle a_1, \eta \rangle}, \dots, t_m e^{\langle a_m, \eta \rangle}\right)^T$. Then $F$ determines a real analytic
isomorphism of $\mathbb{R}^d$ onto the interior of $\mathrm{pos}(A)$.

Proof. First, the fact that $F$ is an injective local isomorphism with image points
arbitrarily close to the extreme rays of $\mathrm{pos}(A)$ is established. Then the result will
follow from an inductive proof that $\mathrm{im}(F)$ is convex.

We have $F(\eta)_i = \sum_j t_j e^{\langle a_j, \eta \rangle} a_{ij}$, so

$$\frac{\partial F_i}{\partial \eta_k} = \sum_j t_j e^{\langle a_j, \eta \rangle} a_{ij} a_{kj},$$

so that the Jacobian is symmetric. Moreover, the quadratic form is given by

$$\sum_{i,k} x_i x_k \frac{\partial F_i}{\partial \eta_k} = \sum_j t_j e^{\langle a_j, \eta \rangle} \langle a_j, x \rangle^2,$$

which is strictly positive for $x \neq 0$ since the $a_j$ span $\mathbb{R}^d$, so some $\langle a_j, x \rangle \neq 0$. This shows
that $F$ is a local isomorphism. To show it is injective, it is sufficient to check that
$F$ is injective on a line, and by a change of coordinates this reduces the problem to
the case $d = 1$. In this case, the $a_j$ are scalars and $F$ sends $\eta \in \mathbb{R}$ to $\sum_j t_j a_j e^{a_j \eta}$,
with strictly positive derivative as above. $\mathrm{pos}(A)$ is either $[0, \infty)$, $(-\infty, 0]$, or $(-\infty, \infty)$
depending on the signs of the $a_j$, and $F$ is an isomorphism of manifolds. Thus the
$d = 1$ case shows $F$ is injective and will also serve as the base case for our induction.

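By grouping the $a_j$ and changing the $t_j$, one may assume no two $a_j$ lie on the
same ray. Suppose $a_1$ generates an extreme ray of $\mathrm{pos}(A)$. Then one may choose $v$
such that $\langle v, a_1 \rangle = 0$ but $\langle v, a_j \rangle < 0$ for $j \neq 1$. Then
$F(\lambda v + \eta) = t_1 e^{\langle \eta, a_1 \rangle} a_1 + \cdots + t_m e^{\langle \eta, a_m \rangle + \lambda \langle v, a_m \rangle} a_m$, so

$$\lim_{\lambda \to \infty} F(\lambda v + \eta) = t_1 e^{\langle \eta, a_1 \rangle} a_1,$$

so that one can approach any point on $\mathbb{R}_{>0}\, a_1$ arbitrarily closely by adjusting $\eta$.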

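It remains to show that the image of $F$ is convex. Suppose $\mathrm{im}(F)$ is convex for
$d - 1$, and let $L$ be a line in $\mathbb{R}^d$; to show $\mathrm{im}(F)$ is convex for $d$, one must show that
$L \cap \mathrm{im}(F)$ is connected or empty. One can write $L = \pi^{-1}(q)$ for a suitable projection
$\pi : \mathbb{R}^d \to \mathbb{R}^{d-1}$ and point $q \in \mathbb{R}^{d-1}$, and

$$\mathrm{im}(F) \cap L = \mathrm{im}(F) \cap \pi^{-1}(q) = F\left((\pi \circ F)^{-1}(q)\right).$$

By a linear change of coordinates in $\mathbb{R}^d$, one may assume that $\pi$ is projection onto
the first $d - 1$ coordinates. Let $p$ denote the projection to the last coordinate, $\eta \mapsto \eta_d$.
Let $G = \pi \circ F$ and for $y \in \mathbb{R}$ let $G_y$ be the restriction of $G$ to $p^{-1}(y)$.

This defines a map $G_y : \mathbb{R}^{d-1} \to \mathbb{R}^{d-1}$, $G_y(\eta_1, \dots, \eta_{d-1}) = \sum_j s_j e^{\langle \tilde{a}_j, \tilde{\eta} \rangle} \tilde{a}_j$, with
$s_j = t_j e^{y a_{dj}} > 0$, where $\tilde{a}_j$ and $\tilde{\eta}$ denote $a_j$ and $\eta$ without their last coordinates.
Still the columns of its defining matrix $\tilde{A}$ ($A$ without its last
row) span $\mathbb{R}^{d-1}$, so $G_y$ meets the hypotheses of the theorem for $d - 1$. Thus each
$G_y$ is an injective map onto $\mathrm{int}(\pi(\mathrm{pos}(A)))$. Then for each $q$ in $\mathrm{int}(\pi(\mathrm{pos}(A)))$, the
projection $G^{-1}(q) = (\pi \circ F)^{-1}(q) \to \mathbb{R}$ induced by $p$ is a bijection. So the intersection
is connected. □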

Now Lemma [12] can be applied to polytopes. 

Proposition 13. Let $A = (a_{ij})$ be a real $d \times m$ matrix, and $K$ be the convex hull of its
columns $a_1, \dots, a_m$. Further require that the $a_j$ not be contained in any hyperplane.
Let $t_1, \dots, t_m \in \mathbb{R}_{>0}$ be real positive numbers and define

$$H : \mathbb{R}^d \to \mathbb{R}^d, \qquad H(\eta) = \frac{1}{Z(\eta)} \sum_{j=1}^{m} t_j e^{\langle a_j, \eta \rangle} a_j,$$

where $Z(\eta) = t_1 e^{\langle a_1, \eta \rangle} + \cdots + t_m e^{\langle a_m, \eta \rangle}$. Then $H$ defines a real analytic isomorphism of
$\mathbb{R}^d$ onto the interior of $K$.

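Proof. Form the cone over $K$ in $\mathbb{R}^{d+1}$, letting $\hat{a}_j = (a_{1j}, \dots, a_{dj}, 1)$. Let

$$F : \mathbb{R}^{d+1} \to \mathbb{R}^{d+1}, \qquad F(\eta, \eta_{d+1}) = \sum_{j} t_j e^{\langle a_j, \eta \rangle} e^{\eta_{d+1}} \hat{a}_j.$$

Since the $a_j$ were not contained in any hyperplane, after lifting they still span $\mathbb{R}^{d+1}$.
Then by Lemma 12, $F$ maps $\mathbb{R}^{d+1}$ isomorphically onto the interior of $\mathrm{pos}(\hat{A})$, where
$\hat{A}$ is the matrix with columns $\hat{a}_j$. The last
coordinate of $F$ is $\sum_j t_j e^{\langle a_j, \eta \rangle} e^{\eta_{d+1}}$; this is equal to 1 when $\eta_{d+1} = -\log(Z(\eta))$. Thus
$\eta \mapsto F(\eta, -\log(Z(\eta)))$, whose first $d$ coordinates are $H(\eta)$, maps $\mathbb{R}^d$ isomorphically onto $\mathrm{int}(K)$. □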
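Proposition 13 can be seen at work in the simplest case $d = 1$, where $H$ is a strictly increasing map of $\mathbb{R}$ onto the open interval between the smallest and largest $a_j$, so any target in that interval has a unique preimage. The sketch below is an illustration only: the columns `a`, the weights `t`, the bisection bounds and the function names are all hypothetical choices.

```python
import math


def tilted_mean(eta, a, t):
    """H(eta) = sum_j t_j e^{a_j eta} a_j / Z(eta) for scalar a_j (the case d = 1)."""
    weights = [t_j * math.exp(a_j * eta) for a_j, t_j in zip(a, t)]
    z = sum(weights)
    return sum(w * a_j for w, a_j in zip(weights, a)) / z


def solve_eta(target, a, t, lo=-50.0, hi=50.0, tol=1e-12):
    """Find eta with H(eta) = target by bisection, using that H is increasing."""
    if not min(a) < target < max(a):
        raise ValueError("target must lie in the interior of the convex hull")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid, a, t) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a = [0.0, 1.0, 2.0]  # columns of A for d = 1; their convex hull is [0, 2]
t = [1.0, 2.0, 1.0]  # positive weights t_j
eta_star = solve_eta(1.5, a, t)
print(eta_star, tilted_mean(eta_star, a, t))
```

With these particular weights, $Z(\eta) = (1 + e^{\eta})^2$ and $H(\eta) = 2 e^{\eta} / (1 + e^{\eta})$, so the solution for the target 1.5 is $\eta = \log 3$, which the bisection recovers.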

Note that if all the $a_j$ lie in a hyperplane (such as when the column sums of $A$ are
equal), coordinates can be changed so that the last row of $A$ is all ones and the
columns of the remaining rows do not lie in a hyperplane.



B Proof of Corollary 5

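The proof is from [37, Proposition 1.9]. First assume that $P_*$ is a vector of marginal
default probabilities and linear correlations with $M + |E| + 1$ coordinates, where the
last coordinate is 1. This implies that there exists a $q \in \{p : \sum_{w=1}^{2^M} p_w = 1\}$ such
that $A_G q = P_*$. Now take $u = q$. Equation (jSJ) can equivalently be written as:

$$\text{Maximize } \theta^{A_G u} \text{ subject to } \theta \in \mathbb{R}_{>0}^{M+|E|} \text{ and } \sum_{j=1}^{2^M} \theta^{a_j} = 1,$$

where

$$\theta^{A_G u} := \prod_{i=1}^{M+|E|} \prod_{j=1}^{2^M} \theta_i^{a_{ij} u_j} = \prod_{i=1}^{M+|E|} \theta_i^{a_{i1} u_1 + \cdots + a_{i,2^M} u_{2^M}} \quad \text{and} \quad \theta^{a_j} := \prod_{i=1}^{M+|E|} \theta_i^{a_{ij}}.$$

Writing $b = A_G u$ for the sufficient statistic, our optimization problem is:

$$\text{Maximize } \theta^{b} \text{ subject to } \theta \in \mathbb{R}_{>0}^{M+|E|} \text{ and } \sum_{j=1}^{2^M} \theta^{a_j} = 1. \qquad (30)$$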
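Using $f(\theta) := (p_1(\log \theta), \dots, p_{2^M}(\log \theta))$ with $p_w(\cdot)$ given by Equation (CQ):

Proposition 14. Let $\hat{p} = f(\hat{\theta})$ be any local maximum for the problem (30). Then:

$$A_G \hat{p} = b.$$

Proof. Introduce a Lagrange multiplier $\lambda$. Every local optimum of (30) is a critical
point of the following function in $M + |E| + 1$ unknowns $\theta_1, \dots, \theta_{M+|E|}, \lambda$:

$$\theta^{b} + \lambda \left(1 - \sum_{j=1}^{2^M} \theta^{a_j}\right).$$

Apply the scaled gradient operator

$$\theta \cdot \nabla_\theta = \left(\theta_1 \frac{\partial}{\partial \theta_1}, \dots, \theta_{M+|E|} \frac{\partial}{\partial \theta_{M+|E|}}\right)$$

to the function above. The resulting critical equations for $\hat{\theta}$ and $\hat{p}$ state that

$$\hat{\theta}^{\,b} \cdot b = \lambda \sum_{j=1}^{2^M} \hat{\theta}^{a_j} a_j = \lambda A_G \hat{p}.$$

This says that the vector $A_G \hat{p}$ is a scalar multiple of the vector $b = A_G u$. Since the
last row of $A_G$ is assumed to be $(1, \dots, 1)$ and the entries of $\hat{p}$ sum to 1, the last
coordinate of $A_G \hat{p}$ is 1; the last element of $b$ is also 1, so the scalar is 1 and $A_G \hat{p} = b$. □

As the matrix $A_G$ is assumed to have full row rank, the proof of Corollary 5
follows from Theorem [H, which states that the parameters satisfying $A_G\, p(\eta) = P_*$
are unique.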